Due dates:
Students will be randomly assigned to groups.
import numpy as np
# set random seed to course number
np.random.seed(2453)
# student initials
students = ['JB', 'CC', 'JZC', 'JC', 'SG', 'PL', 'JR', 'XT', 'SV', 'LW', 'JY']
print('There are', len(students), 'students in the class. '
      'Randomly select four groups of two and one group of three.\n')
students = np.array(students)
# 1. select 4 groups of size 2 without replacement
groups = np.random.choice(students, size=(4, 2), replace=False)
# remaining students will be in 1 group of size 3
# 2. flatten the list of students assigned to a pair
groups2 = [item for sublist in groups.tolist() for item in sublist]
# 3. list of students that didn't get assigned to a pair
group3 = list(set(students).difference(set(groups2)))
# print out groups
for i in range(len(groups)):
    print('Group', i + 1, 'is:', groups[i, 0], 'and', groups[i, 1])
print('Group', len(groups) + 1, 'is:', group3[0], ',', group3[1], 'and', group3[2])
nbdime is a very useful Python library for diffing and merging .ipynb files when working with git and GitHub.
Your presentation should demonstrate how your interactive Jupyter notebook answers the questions. Create your slides with RISE (Python) or xaringan (R). You may also demonstrate parts of your project using voila (Python). The reason for using these presentation tools is that your presentation slides are reproducible.
You and your partner will create user documentation for the web page. The documentation requirements are described below.
The time allotted for each presentation is 10 minutes plus 5 minutes for questions/discussion (15 minutes for the group with three people). The time that each person speaks should be approximately equal (i.e., 5 minutes). This time limit will be enforced. If you exceed the time limit then you will be asked to stop the presentation. This means that you should rehearse your presentation timing before you present to the class.
The goal of the presentation is to effectively communicate how librarians can use your web page to answer the questions (i.e., the communication is aimed at a non-technical, but educated, audience). This does not mean that you should not include technical details, but you should aim to communicate the findings to an audience without a background in statistics, math, or computer science.
You will need to remind us about the project, but only tell us what we really need to know. We are curious about the results, and how you present them, but they are not the only purpose of this presentation. So, what should you include? Examples of questions to consider as you prepare your presentation are:
The Jupyter notebook or R Markdown file that you used for the presentation should be pushed to your Github repository for this assignment by **March 31, 9:45**. Your presentation will be evaluated according to this rubric.
The user documentation should explain to users what data is being displayed on your web page. For example, if you use the data to do a calculation or create a plot then explain why the calculation was done, and how it should be interpreted.
The documentation should be broken into sections that correspond to the sections of your web page.
The user documentation should be done using a Jupyter notebook/R markdown document. Ideally your group would find a way to incorporate the documentation into the design of the web page, although this isn't necessary.
Your user documentation will be evaluated for clarity and conciseness.
Titles [1-5]: There should be an appropriate title for each section of the web page.
Introductions [1-5]: What is the purpose of each section?
Methods [1-5]: Statistical calculations and data visualizations should be clearly explained to users in each section of the web page without assuming a background in statistics, math, or computer science.
General Considerations [1-5]: The documentation should be presented in logical order, with well-organized sections, no grammatical, spelling, or punctuation errors, an appropriate level of technical detail, and be clear and easy to follow.
Workflow [1-5]: Groups should follow the project workflow by creating a branch for each member, and making pull requests and merges using git and GitHub.
The web page will be graded by evaluating:
Data visualization and web page layout will be evaluated for:
clarity (can the data and figures be clearly seen and understood by the user?),
ease of use (is the web page easy to use? for example, is it easy to navigate?), and
communication (does the web page communicate appropriate responses to user queries?).
The final Jupyter notebook or R Markdown file should be pushed to your Github repository for this assignment by **April 10, 23:59**.
Coming soon ...
The 307 data can be accessed using the Numina API.
Login to the dashboard and select a sensor.
Select mode.
Add a behaviour zone (optional).
Select time frame.
Export CSV.
Store your Numina credentials in a file named login.py and use the magic %run to read them into your Jupyter notebook. The file login.py should contain:
login = "yourname@utoronto.ca"
pwd = "yourpassword"
# store login data in login.py
%run login.py
# login query as multiline formatted string
# this assumes that login and pwd are defined
# above
loginquery = f"""
mutation {{
logIn(
email: "{login}",
password: "{pwd}") {{
jwt {{
token
exp
}}
}}
}}
"""
Send the login query to the Numina API using the requests library.
import requests
url = 'https://api.numina.co/graphql'
mylogin = requests.post(url, json={'query': loginquery})
mylogin
A login token was successfully returned by the Numina server (i.e., a response of 200 was returned). Now, store the token in token for use in subsequent queries.
token = mylogin.json()['data']['logIn']['jwt']['token']
Note that tokens expire after 24 hours by default.
expdate = mylogin.json()['data']['logIn']['jwt']['exp']
expdate
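Since tokens expire, it can be useful to check how much time remains before re-authenticating. The sketch below assumes the exp field is an ISO-8601 timestamp string; the hard-coded value is hypothetical and stands in for the value returned in mylogin.

```python
from datetime import datetime, timezone

# Hypothetical expiry value; in practice this comes from
# mylogin.json()['data']['logIn']['jwt']['exp']
expdate = "2019-12-02T09:45:00+00:00"

# Parse the ISO-8601 timestamp and compute the time remaining
exp = datetime.fromisoformat(expdate)
remaining = exp - datetime.now(timezone.utc)

if remaining.total_seconds() <= 0:
    print("Token expired -- log in again")
else:
    print("Token valid for another", remaining)
```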
The following query requests the name, serial number, and rawId of all devices (sensors); the rawId can be used to uniquely identify a device in other requests.
query1 = """
query {
devices {
count
edges {
node {
rawId
name
serialno
}
}
}
}
"""
devices = requests.post(url, json={'query': query1}, headers = {'Authorization':token})
devices.json()
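The nested edges/node structure of the response can be flattened into a pandas DataFrame for easier inspection. A sketch using a hypothetical dictionary in the same shape as devices.json() (the second device is invented for illustration):

```python
import pandas as pd

# Hypothetical response in the same shape as devices.json();
# the second device is invented for illustration
devices_json = {
    'data': {'devices': {'count': 2, 'edges': [
        {'node': {'rawId': '1', 'name': 'Streetscape - Sandbox',
                  'serialno': 'SWLSANDBOX1'}},
        {'node': {'rawId': '2', 'name': 'Streetscape - Outdoor',
                  'serialno': 'SWLOUTDOOR1'}},
    ]}}
}

# Pull each node dict out of the edges list and build a DataFrame
nodes = [edge['node'] for edge in devices_json['data']['devices']['edges']]
device_df = pd.DataFrame(nodes)
print(device_df)
```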
Counts queries are used to get counts of objects that were observed in a given time interval.
The following query finds the number of pedestrians detected daily by the indoor sensor (the sensor that has name Streetscape - Sandbox) from 2019-12-01 to 2019-12-31.
query2 = """
query {
feedCountMetrics(
serialnos:["SWLSANDBOX1"],
startTime:"2019-12-01T00:00:00",
endTime:"2020-01-01T00:00:00",
objClasses:["pedestrian"],
timezone:"America/New_York",
interval:"24h") {
edges {
node {
serialno
result
objClass
time
}
}
}
}
"""
dec2019peds = requests.post(url, json={'query': query2}, headers = {'Authorization':token})
Sample output from dec2019peds.json() of the daily pedestrian counts for December 2019 is shown below:
{'data': {'feedCountMetrics': {'edges': [{'node': {'objClass': 'pedestrian',
'result': 1.0,
'serialno': 'SWLSANDBOX1',
'time': '2019-12-01T00:00:00-05:00'}},
{'node': {'objClass': 'pedestrian',
'result': 69.0,
'serialno': 'SWLSANDBOX1',
'time': '2019-12-02T00:00:00-05:00'}},
...
...
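The edges in the counts response can be flattened the same way and indexed by time, which makes plotting and resampling straightforward. A minimal sketch using the two sample records shown above; in practice the list would come from dec2019peds.json()['data']['feedCountMetrics']['edges']:

```python
import pandas as pd

# Two sample records copied from the output above; in practice build this
# list from dec2019peds.json()['data']['feedCountMetrics']['edges']
edges = [
    {'node': {'objClass': 'pedestrian', 'result': 1.0,
              'serialno': 'SWLSANDBOX1', 'time': '2019-12-01T00:00:00-05:00'}},
    {'node': {'objClass': 'pedestrian', 'result': 69.0,
              'serialno': 'SWLSANDBOX1', 'time': '2019-12-02T00:00:00-05:00'}},
]

counts = pd.DataFrame([e['node'] for e in edges])
# Parse the timestamps so the column can be used as a time index
counts['time'] = pd.to_datetime(counts['time'])
counts = counts.set_index('time')
print(counts[['result']])
```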
query3 = """
query {
feedHeatmaps(
serialno: "SWLSANDBOX1",
startTime:"2019-12-01T00:00:00",
endTime:"2019-12-31T00:00:00",
objClasses:["pedestrian"],
timezone:"America/New_York") {
edges {
node {
time
objClass
heatmap
}
}
}
}
"""
dec2019heat = requests.post(url, json={'query': query3}, headers = {'Authorization':token})
Sample output from dec2019heat.json() of the pedestrian heatmap data for December 2019 is shown below:
{'data': {'feedHeatmaps': {'edges': [{'node': {'heatmap': [[495, 39, 0.192],
[496, 39, 0.192],
[497, 39, 0.192],
[498, 39, 0.192],
[508, 39, 0.192],
[487, 40, 0.192],
Visualization of this data can be done in the Numina dashboard, or you can use a library such as OpenCV-Python (an R wrapper for OpenCV is also available here).
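Before visualizing, the sparse heatmap triples can be converted into a dense image array. This is a minimal sketch, assuming each entry is an [x, y, intensity] triple in sensor pixel coordinates (as the sample output above suggests); the resulting array could then be displayed with matplotlib's imshow or OpenCV.

```python
import numpy as np

# Sample (x, y, intensity) triples copied from the heatmap output above
points = [[495, 39, 0.192], [496, 39, 0.192], [497, 39, 0.192],
          [498, 39, 0.192], [508, 39, 0.192], [487, 40, 0.192]]

# Build a dense image array indexed by (row=y, col=x); pixels with no
# detections stay at 0
h = max(p[1] for p in points) + 1
w = max(p[0] for p in points) + 1
grid = np.zeros((h, w))
for x, y, intensity in points:
    grid[y, x] = intensity

print(grid.shape)     # (41, 509)
print(grid[39, 495])  # 0.192
```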
This section contains sample code that can add interactivity to your web page. The dataframe raincoatdat contains data from one of the sensors at 307 from November 1 through December 31, 2019.
import pandas as pd
import numpy as np
raincoatdat = pd.read_csv('raincoatdatnovdec2019.csv')
raincoatdat.head(n=3)
ipywidgets can be used to create a pull-down menu that displays the data for a specific day during the time period.
import ipywidgets as widgets
from IPython.display import HTML
# dropdown menu of dates
dd = widgets.Dropdown(options = raincoatdat.time)
# Output widget for dataframe
out2 = widgets.Output()
# display dropdown and dataframe
display(dd, out2)
# dd_eventhand is an event handler for displaying a filtered view of the dataframe
# The callback registered must have the signature handler(change) where change is a
# dictionary holding the information about the change.
# see https://ipywidgets.readthedocs.io/en/latest/examples/Widget%20Events.html and
# the doc string for observe (i.e., print(widgets.Widget.observe.__doc__))
def dd_eventhand(change):
    out2.clear_output()  # clear current output
    with out2:
        # display columns 1-4 of the dataframe, filtered by the date selected in the dropdown
        display(HTML(raincoatdat[raincoatdat['time'] == change.new][list(raincoatdat)[1:5]].to_html(index=False)))

dd.observe(dd_eventhand, names='value')
plotly can be used to generate interactive plots and can also be combined with ipywidgets. The example below is adapted from this example.
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(go.Scatter(x=raincoatdat.time, y=raincoatdat['pedestrians'], name="Pedestrians",
line_color='#003f5c'))
fig.add_trace(go.Scatter(x=raincoatdat.time, y=raincoatdat['bicyclists'], name="Bicyclists",
line_color='#7a5195'))
fig.add_trace(go.Scatter(x=raincoatdat.time, y=raincoatdat['cars'], name="Cars",
line_color='#ef5675'))
fig.add_trace(go.Scatter(x=raincoatdat.time, y=raincoatdat['buses'], name="Buses",
line_color='#ffa600'))
fig.update_layout(title_text='Number of Objects Detected - Under Raincoat Sensor',
xaxis_rangeslider_visible=True)
fig.show()